MERT-v1-330M is a music understanding model with 330M parameters, trained under the masked language modeling (MLM) paradigm. It supports a 24 kHz audio sampling rate and produces features at 75 Hz, and is suited to a range of music information retrieval tasks.
Audio Classification
Transformers
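As a rough illustration of what the 24 kHz sampling rate and 75 Hz feature rate mean together, the sketch below works out how many audio samples correspond to one feature frame and roughly how many frames a clip yields. The function names are hypothetical, and the frame count ignores any edge effects in the model's front end, so treat it as an estimate rather than the model's exact output length.

```python
# Rates taken from the model description above.
SAMPLE_RATE_HZ = 24_000  # input audio sampling rate
FEATURE_RATE_HZ = 75     # output feature (frame) rate


def samples_per_frame() -> int:
    """Number of input audio samples covered by one feature frame."""
    return SAMPLE_RATE_HZ // FEATURE_RATE_HZ


def approx_num_frames(duration_seconds: float) -> int:
    """Approximate frame count for a clip of the given duration.

    Assumption: boundary effects of the model's front end are ignored,
    so the real output length may differ by a frame or so.
    """
    return int(duration_seconds * FEATURE_RATE_HZ)


if __name__ == "__main__":
    print(samples_per_frame())     # 24000 / 75 = 320 samples per frame
    print(approx_num_frames(5.0))  # about 375 frames for a 5-second clip
```

Audio recorded at a different rate would need to be resampled to 24 kHz before these numbers apply.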